WC 2021

Program at a Glance

10th World Congress in Probability and Statistics

Plenary Tue-1: IMS Medallion Lecture (Laurent Saloff-Coste) Plenary Tue-2: IMS Medallion Lecture (Elchanan Mossel) Plenary Tue-3: Levy Lecture (Massimiliano Gubinelli) Plenary Tue-4: Doob Lecture (Nicolas Curien)

Invited 04: Mathematical Population Genetics and Computational Statistics (Organizer: Paul Jenkins) Invited 18: Deep Learning (Organizer: Johannes Schmidt-Hieber)

Invited 16: Bootstrap for High-dimensional Data (Organizer: Kengo Kato) Invited 27: Random Matrices and Related Fields (Organizer: Manjunath Krishnapur) Invited 28: Statistical Inference for Graphs and Networks (Organizer: Betsy Ogburn) Invited 31: Information Theory and Concentration Inequalities (Organizer: Chandra Nair)

Invited 02: Scaling Limits of Disordered Systems and Disorder Relevance (Organizer: Rongfeng Sun) Invited 07: High-dimensional Robustness (Organizer: Stanislav Minsker) Invited 08: Functional Data Analysis (Organizer: Aurore Delaigle) Invited 32: Statistical Learning (Organizer: Yichao Wu) Invited 41: Bernoulli Paper Prize Session (Organizer: Bernoulli Society)

Organized 07: Anomalous Diffusions and Related Topics (Organizer: Zhen-Qing Chen) Organized 17: The Advances in Time Series and Spatial Statistics (Organizer: Wei-Ying Wu) Organized 24: Advanced Statistical Methods for Complex Data (Organizer: Jongho Im)

Organized 12: Recent Developments for Dependent Data (Organizer: Mikyoung Jun) Organized 16: Non-Euclidean Statistical Inference (Organizer: Young Kyung Lee)

Organized 06: Theoretical Analysis of Random Walks, Random Graphs and Clustering (Organizer: Ji Oon Lee) Organized 13: Recent Advances in Complex Time Series Analysis (Organizer: Haeran Cho)

Contributed 25: Time Series Analysis II

Contributed 02: Financial Mathematics and Probabilistic Modeling Contributed 07: SDEs and Fractional Brownian Motions Contributed 28: Neural Networks and Deep Learning

Contributed 10: Reflecting Diffusion Processes, Stochastic Networks and Their Applications (Organizer: Amber Puha) Contributed 16: Probability Theory and Statistical Mechanics Contributed 19: Detection and Segmentation Contributed 23: Bayesian Nonparametric Inference

Poster I-1: Poster Session I-1 Poster I-2: Poster Session I-2

Organized Contributed Session (live Q&A at Track 3, 10:30PM KST)

Organized 06

Theoretical Analysis of Random Walks, Random Graphs and Clustering (Organizer: Ji Oon Lee)

Conference

10:30 PM — 11:00 PM KST

Local

Jul 20 Tue, 6:30 AM — 7:00 AM PDT

Spectral large deviations for sparse random matrices

Kyeongsik Nam (University of California, Los Angeles)

The large deviation problem for the spectrum of random matrices has attracted immense interest. It was first studied for GUE and GOE, which are exactly solvable, and subsequently studied for Wigner matrices with general distributions. Once the sparsity is induced (i.e. each entry is multiplied by the independent Bernoulli distribution Ber(p)), eigenvalues can exhibit a drastically different behavior. Constant average degree of sparsity, p~1/n, is an interesting case since universality breaks down at this regime of sparsity. In this talk, I will consider Gaussian ensembles with sparsity p=1/n and talk about the typical behavior and large deviation estimates for the largest eigenvalue. Joint work with Shirshendu Ganguly.

Robust hypergraph clustering via convex relaxation of truncated MLE

Hye Won Chung (Korea Advanced Institute of Science and Technology (KAIST))

We study hypergraph clustering in the weighted $d$-uniform hypergraph stochastic block model ($d$-WHSBM), where each edge consisting of $d$ nodes from the same community has higher expected weight than the edges consisting of nodes from different communities. We propose a new hypergraph clustering algorithm, called CRTMLE, and provide its performance guarantee under the $d$-WHSBM for general parameter regimes. We show that the proposed method achieves the order-wise optimal or the best existing results for approximately balanced community sizes. Moreover, our results settle the first recovery guarantees for growing number of clusters of unbalanced sizes. Involving theoretical analysis and empirical results, we demonstrate the robustness of our algorithm against the unbalancedness of community sizes or the presence of outlier nodes.

Convergence rate to the Tracy-Widom laws for the largest eigenvalue of Wigner matrices

Kevin Schnelli (KTH Royal Institute of Technology)

In this talk we discuss quantitative versions of the edge universality for Wigner matrices and related random matrix models. The fluctuations of the largest rescaled eigenvalues for the Gaussian invariant ensembles are described by the Tracy-Widom laws. The universality of these laws has in recent years been established for many non-invariant random matrix models. We will present new results on the convergence rate to the universal laws for the largest eigenvalues of Wigner matrices and high-dimensional sample covariance matrices.

Attributed graph alignment

Lele Wang (University of British Columbia)

Motivated by various data science applications including de-anonymizing user identities in social networks, we consider the graph alignment problem, where the goal is to identify the vertex/user correspondence between two correlated graphs. Existing work mostly recovers the correspondence by exploiting the user-user connections. However, in many real-world applications, additional information about the users, such as user profiles, might be publicly available. In this paper, we introduce the attributed graph alignment problem, where additional user information, referred to as attributes, is incorporated to assist graph alignment. We establish sufficient and necessary conditions for recovering vertex correspondence exactly, where the conditions match for a wide range of practical regimes. Our results recover existing tight information-theoretic limits for models where only the user-user connections are available, and further span the full spectrum between these models and models where only attribute information is available.

This is joint work with Ning Zhang and Weina Wang.

Minkowski content for the scaling limit of loop-erased random walk in three dimensions

Xinyi Li (Peking University)

In this talk, we will discuss loop-erased random walk in three dimensions and its scaling limit, and briefly explain how to prove the existence of the Minkowski content of the latter and why it gives the scaling limit of the former in natural parametrizaion. This is joint work with Daisuke Shiraishi (Kyoto).

Q&A for Organized Contributed Session 06

This talk does not have an abstract.

Session Chair

Ji Oon Lee (Korea Advanced Institute of Science and Technology (KAIST))

Organized 13

Recent Advances in Complex Time Series Analysis (Organizer: Haeran Cho)

Conference

10:30 PM — 11:00 PM KST

Local

Jul 20 Tue, 6:30 AM — 7:00 AM PDT

Change points detection for high dimensional time series

Likai Chen (Washington University in Saint Louis)

We consider multiple change-points detection of high-dimensional time series. Asymptotic theory for testing the existence of breaks, the consistency and the asymptotic distribution of the breakpoint statistics and estimated break sizes will be provided. The theory backs up a simple two-step procedure for detecting and estimating multiple change-points. The proposed two-step procedure involves the maximum of a MOSUM (moving sum) type statistics in the first step and a CUSUM (cumulative sum) refinement step on an aggregated time series in the second step. Thus, for a fixed time-point, we can capture both the biggest break across different coordinates and aggregating simultaneous breaks over multiple coordinates. Extending the existing high-dimensional Gaussian approximation theorem to dependent data with jumps, the theory allows us to characterize the size and power of our multiple change-point test asymptotically. Moreover, we can make inferences on the breakpoints estimates when the break sizes are small. Our theoretical setup incorporates both weak temporal and strong or weak cross-sectional dependence and is suitable for heavy-tailed innovations. A robust long-run covariance matrix estimation is proposed, which can be of independent interest. An application on detecting structural changes of the U.S. unemployment rate is considered to illustrate the usefulness of our method.

Asymptotics of large autocovariance matrices

Monika Bhattacharjee (Indian Institute of Technology Bombay)

We consider the high-dimensional moving average process and explore the asymptotics for eigenvalues of its sample autocovariance matrices. Under quite weak conditions, we prove, in a unified way, that the limiting spectral distribution (LSD) of any symmetric polynomial in the sample autocovariance matrices, after suitable centering and scaling, exists and is non-degenerate. We use methods from free probability in conjunction with the method of moments to establish our results. In addition, we are able to provide a general description for the limits in terms of some freely independent variables. We also establish asymptotic normality results for the traces of these matrices. We suggest statistical uses of these results in problems such as order determination of high-dimensional MA and AR processes and testing of hypotheses for coefficient matrices of such processes.

Factor models for matrix-valued high-dimensional time series

Xialu Liu (San Diego State University)

In finance, economics and many other fields, observations in a matrix form are often observed over time. For example, many economic indicators are obtained in different countries over time. Various financial characteristics of many companies are reported over time. Although it is natural to turn a matrix observation into a long vector then use standard vector time series models or factor analysis, it is often the case that the columns and rows of a matrix represent different sets of information that are closely interrelated in a very structural way. We propose a novel factor model that maintains and utilizes the matrix structure to achieve greater dimensional reduction as well as finding clearer and more interpretable factor structures. Estimation procedure and its theoretical properties are investigated and demonstrated with simulated and real examples.

Multi-level changepoint inference for periodic data sequences

Anastasia Ushakova (Lancaster University)

Existing changepoint approaches consider changepoints to occur linearly in time; one changepoint happens after another and they are not linked. However, data processes may have regularly occurring changepoints, e.g. a yearly increase in sales of ice-cream on the first hot weekend. Using linear changepoint approaches here will miss more global features such as a decrease in sales of ice-cream in favour of sorbet. Being able to tease these global changepoint features from the more local (periodic) ones is beneficial for inference. We propose a periodic changepoint model to model this behaviour using a mixture of a periodic and linear time perspective. Built around a Reversible Jump Markov Chain Monte Carlo sampler, the Bayesian framework is used to study the local (periodic) changepoint behaviour. To identify the optimal global changepoint positions we integrate the local changepoint model into the pruned exact linear time (PELT) search algorithm. We demonstrate that the method detects both local and global changepoints with high accuracy on simulated and motivating applications that share periodic behaviour. Due to the multi-level nature of the analysis, visualisation of the results can be challenging. We additionally provide a unique multi-level perspective for changepoint visualisations in data sequences.

Q&A for Organized Contributed Session 13

This talk does not have an abstract.

Session Chair

Haeran Cho (University of Bristol)

Program at a Glance

10th World Congress in Probability and Statistics

Organized Contributed Session (live Q&A at Track 3, 10:30PM KST)

Theoretical Analysis of Random Walks, Random Graphs and Clustering (Organizer: Ji Oon Lee)

Spectral large deviations for sparse random matrices

Robust hypergraph clustering via convex relaxation of truncated MLE

Convergence rate to the Tracy-Widom laws for the largest eigenvalue of Wigner matrices

Attributed graph alignment

Minkowski content for the scaling limit of loop-erased random walk in three dimensions

Q&A for Organized Contributed Session 06

Session Chair

Recent Advances in Complex Time Series Analysis (Organizer: Haeran Cho)

Change points detection for high dimensional time series

Asymptotics of large autocovariance matrices

Factor models for matrix-valued high-dimensional time series

Multi-level changepoint inference for periodic data sequences

Q&A for Organized Contributed Session 13

Session Chair